Mutagenic probability estimation of chemical compounds by a novel molecular electrophilicity vector and support vector machine

Authors

  • Mingyue Zheng
  • Zhiguo Liu
  • Chunxia Xue
  • Weiliang Zhu
  • Kaixian Chen
  • Xiaomin Luo
  • Hualiang Jiang

Abstract

MOTIVATION: Mutagenicity is among the toxicological end points of highest concern. The accelerated pace of drug discovery has heightened the need for efficient prediction methods. Most currently available tools fall short of the desired accuracy and can only provide a binary classification. A discriminative and informative model for mutagenicity prediction is therefore needed.

RESULTS: To address this problem, we developed a mutagenic probability prediction model based on datasets covering a large chemical space. A novel molecular electrophilicity vector (MEV) is first devised to represent the structure profile of chemical compounds. An extended support vector machine (SVM) method is then used to derive a posterior probability estimate of mutagenicity from the MEVs of the training set. The results show that our model performs better than TOPKAT (http://www.accelrys.com) and other previously published methods. In addition, the model can provide a confidence level for each prediction, which may support more flexible decisions on chemical ordering or synthesis.

AVAILABILITY: The binary program (ZGTOX_1.1) based on our model and sample input datasets for Windows PC are available at http://dddc.ac.cn/adme upon request from the authors.
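The abstract does not say how the extended SVM turns its output into a posterior probability. One common way to do this is Platt scaling, which fits a sigmoid to the raw SVM decision values. The sketch below illustrates that general idea only and is not the authors' method: the function names, the gradient-descent fit, and the toy decision values and labels are all assumptions made for illustration, and no MEV descriptors are computed.

```python
import math


def platt_probability(decision_value, a, b):
    """Map a raw SVM decision value to a probability with a sigmoid."""
    return 1.0 / (1.0 + math.exp(a * decision_value + b))


def fit_platt(decision_values, labels, lr=0.1, iters=5000):
    """Fit sigmoid parameters (a, b) by gradient descent on the
    mean negative log-likelihood of the labels (1 = mutagenic)."""
    a, b = 0.0, 0.0
    n = len(decision_values)
    for _ in range(iters):
        grad_a = 0.0
        grad_b = 0.0
        for f, t in zip(decision_values, labels):
            p = platt_probability(f, a, b)
            # Derivatives of the negative log-likelihood w.r.t. a and b.
            grad_a += (t - p) * f
            grad_b += (t - p)
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


# Hypothetical decision values and labels, invented for this sketch.
toy_scores = [-2.0, -1.5, -0.5, 0.3, 1.0, 2.2]
toy_labels = [0, 0, 1, 0, 1, 1]

a, b = fit_platt(toy_scores, toy_labels)
for score in (-2.0, 0.0, 2.0):
    prob = platt_probability(score, a, b)
    print(f"decision value {score:+.1f} -> P(mutagenic) = {prob:.2f}")
```

A probability close to 0 or 1 would correspond to a high-confidence prediction, while values near 0.5 flag compounds for which a binary label alone would be misleading.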

Similar resources

Application of Genetic Algorithm Based Support Vector Machine Model in Second Virial Coefficient Prediction of Pure Compounds

In this work, a Genetic Algorithm-boosted Least Squares Support Vector Machine model, an improved version of the Support Vector Machine that solves a set of linear equations instead of a quadratic program, was used to estimate the second virial coefficients of 98 pure compounds. The compounds were classified into different groups. The best parameters were obtained by the Genetic Algorithm method ...

QSAR Study of 17β-HSD3 Inhibitors by Genetic Algorithm-Support Vector Machine as a Target Receptor for the Treatment of Prostate Cancer

The 17β-HSD3 enzyme plays a key role in the treatment of prostate cancer, and small inhibitors can be used to efficiently target it. In the present study, the multiple linear regression (MLR) and support vector machine (SVM) methods were used to interpret the chemical structural functionality against the inhibition activity of some 17β-HSD3 inhibitors. Chemical structural information was described thro...

Mining Biological Repetitive Sequences Using Support Vector Machines and Fuzzy SVM

Structural repetitive subsequences are the most important portion of biological sequences and play crucial roles in the corresponding sequence’s fold and functionality. The largest class of repetitive subsequences is “Transposable Elements”, which has its own sub-classes based on the structures of their contexts. Many studies have been performed to critically determine the structure and function of repetitiv...

Support vector regression with random output variable and probabilistic constraints

Support Vector Regression (SVR) solves regression problems based on the concept of the Support Vector Machine (SVM). In this paper, a new SVR model with probabilistic constraints is proposed in which the output data and the bias are treated as random variables with uniform probability functions. Using the proposed method, the optimal regression hyperplane can be obtained by solving a quadrati...

Journal title:
  • Bioinformatics

Volume 22, Issue 17

Pages: -

Publication date: 2006